ABSTRACT

Available are n independent observations (continuous data) that are
believed to be a random sample. Desired are distribution-free confidence
intervals and significance tests for the population median. However,
there is the possibility that either the smallest or the largest
observation is an outlier. Then, use of a procedure for rejection of an
outlying observation might seem appropriate. Such a procedure would
consider that two alternative situations are possible and would select
one of them. Either (1) the n observations are truly a random sample, or
(2) an outlier exists and its removal leaves a random sample of size
n-1.

For either situation, confidence intervals and tests are desired
for the median of the population yielding the random sample.
Unfortunately, satisfactory rejection procedures of a distribution-free
nature do not seem to be available. Moreover, all rejection procedures
impose undesirable conditional effects on the observations and can also
select the wrong one of the two above situations. Such difficulties
could be bypassed if intervals and tests are used that simultaneously
apply to both situations, i.e., if a confidence coefficient, or
significance level, has the same value for both situations. It is found
that two-sided intervals and tests based on two symmetrically located
order statistics (not the largest and smallest) of the n observations
have this property.

INTRODUCTION AND DISCUSSION

The data are n independent observations that are continuous data and 
are believed to be a random sample. The order statistics of these obser- 
vations are 

\[ x(1) \le x(2) \le \cdots \le x(n). \]

Distribution-free confidence intervals and significance tests are desired 
for the median θ (not necessarily unique) of the population sampled.
However, the possibility exists that x(n) is an outlier, or that x(1)
is an outlier. That is, x(n) is so much larger
than the other observations that there is doubt that it was produced
by the population that produced the other n-1 observations. Alternatively,
x(1) is so much smaller than the other observations that there is doubt
that it came from the population that yielded the other n-1 observations.

When such a doubt exists, use of a procedure for deciding on the 
rejection of an outlying observation might seem appropriate. A standard 
rejection procedure would consider that two situations are possible and, 
on the basis of the observations, would select one of these two situations 
(as that which occurs).

The n observations are truly a random sample for one of the two
situations (with the median θ of the associated population being
investigated). The doubtful observation is an outlier for the other
situation. More specifically, the population yielding the suspected
outlier is different from the population yielding the other n-1
observations, and in such a way that removal of the outlier leaves a
random sample of size n-1. In addition, the population for the random
sample obtained under these conditional circumstances is considered to
be the same as the population that unconditionally yielded these n-1
observations. Then, distribution-free intervals and tests are desired
for the median θ of the population yielding the sample of size n-1 (for
the situation where the doubtful observation is an outlier). Also, when
x(n) is an outlier, x(1), ..., x(n-1) are the order statistics of the
sample of size n-1, while x(2), ..., x(n) are the order statistics of
this sample when x(1) is an outlier.

Unfortunately, development of a satisfactory procedure for rejection
of an outlier is a formidable problem for distribution-free cases. What
represents a substantial deviation from the other observations depends
strongly on the distribution tail (which can be of any continuous form in
the distribution-free cases). Even if a satisfactory rejection procedure
could be developed, its use would involve important difficulties. First,
the wrong one of the two situations might be selected. Second, use of
the rejection procedure would introduce undesirable conditional effects
on the probability properties of the observations. For example, suppose
that the n observations are truly a random sample. They will no longer
be a random sample after being subjected to the rejection procedure, even
if the correct situation is selected. That is, only those sets whose n
observed values satisfy one or more requirements imposed by the procedure
are considered to be random samples.

A more attractive approach would be to use intervals and tests that
apply simultaneously to both situations. That is, a confidence interval
has the same confidence coefficient for the two situations. Also, a
test has the same significance level for both situations. Fortunately,
intervals and tests with this property can be developed. In fact, the
well-known equal-tail sign tests, and the corresponding two-sided
confidence intervals, are shown to have this property (when x(1) and
x(n) are not used). This is the case whether x(n) could be an outlier
or x(1) could be an outlier. For convenience of presentation, only the
confidence intervals are explicitly considered. However, the property
for the corresponding tests follows in a direct fashion, since the tests
can be obtained directly from the intervals.

If the n observations were truly a random sample, the well-known
confidence intervals defined by

\[ P[x(i) \le \theta \le x(n+1-i)] = 1 - (1/2)^{n-1} \sum_{j=0}^{i-1} \binom{n}{j} \tag{1} \]

are applicable. These are the confidence intervals considered (for
2 ≤ i < n/2). The relationship (1) is found to hold when x(1) is an
outlier and also when x(n) is an outlier. Verification of this
property is given in the next section.
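
As an illustration, the confidence coefficient in (1) can be evaluated
exactly with rational arithmetic. The following Python sketch is not part
of the original report; the function name `confidence_coefficient` is an
assumption introduced here, and the argument check simply mirrors the
stated range 2 ≤ i < n/2.

```python
from fractions import Fraction
from math import comb


def confidence_coefficient(n: int, i: int) -> Fraction:
    """Exact value of 1 - (1/2)^(n-1) * sum_{j=0}^{i-1} C(n, j).

    This is the confidence coefficient of the interval [x(i), x(n+1-i)]
    for the median, as given by relationship (1).
    The function name and the range check are illustrative assumptions.
    """
    if i < 2 or 2 * i >= n:
        raise ValueError("expected 2 <= i < n/2")
    two_tail_sum = sum(comb(n, j) for j in range(i))  # j = 0, ..., i-1
    return 1 - Fraction(two_tail_sum, 2 ** (n - 1))


if __name__ == "__main__":
    coeff = confidence_coefficient(10, 3)
    print(coeff, float(coeff))
```

For n = 10 and i = 3 the sum is 1 + 10 + 45 = 56, so the coefficient is
1 - 56/512 = 57/64, or about 0.89.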

VERIFICATION 

Only the situation where x(1) is an outlier receives consideration.
A similar method provides verification that (1) holds when x(n) is an
outlier.

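In general, the value of P[x(i) ≤ θ ≤ x(n+1-i)] can be expressed
as unity minus

\[ P[x(i) > \theta] + P[x(n+1-i) < \theta]. \]

When x(1) is an outlier, x(2) becomes the smallest observation, etc., and

\[ P[x(i) > \theta] = (1/2)^{n-1} \sum_{j=0}^{i-2} \binom{n-1}{j}, \]

\[ P[x(n+1-i) < \theta] = (1/2)^{n-1} \sum_{j=0}^{i-1} \binom{n-1}{j}, \]

with their sum being

\[ (1/2)^{n-1} \Bigl[ 1 + \sum_{j=1}^{i-1} A_j \Bigr], \]

where

\[ A_j = \binom{n-1}{j-1} + \binom{n-1}{j}. \]

However,

\[ A_j = \binom{n}{j} \]

for 1 ≤ j < n. Thus, the value of P[x(i) ≤ θ ≤ x(n+1-i)] is

\[ 1 - (1/2)^{n-1} \sum_{j=0}^{i-1} \binom{n}{j}, \]

which is the value of (1).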
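
The verification can also be checked numerically. The sketch below is
an illustration added here, not part of the original report, and its
function names are assumptions. For the situation where x(1) is an
outlier, it computes the two tail probabilities from a sample of size
n-1 and confirms, in exact rational arithmetic, that their sum equals
the two-tail total implied by (1).

```python
from fractions import Fraction
from math import comb


def tails_when_x1_is_outlier(n: int, i: int):
    """Return (P[x(i) > theta], P[x(n+1-i) < theta]) when x(1) is an outlier.

    The remaining n-1 observations are treated as a random sample, so
    x(i) is their (i-1)-th order statistic and x(n+1-i) their (n-i)-th.
    """
    scale = Fraction(1, 2 ** (n - 1))
    lower = scale * sum(comb(n - 1, j) for j in range(i - 1))  # j = 0..i-2
    upper = scale * sum(comb(n - 1, j) for j in range(i))      # j = 0..i-1
    return lower, upper


def two_tail_total_from_relationship_1(n: int, i: int) -> Fraction:
    """(1/2)^(n-1) * sum_{j=0}^{i-1} C(n, j), i.e. one minus relationship (1)."""
    return Fraction(1, 2 ** (n - 1)) * sum(comb(n, j) for j in range(i))


def relationship_holds(n_max: int) -> bool:
    """Check the equality for every n <= n_max and every 2 <= i < n/2."""
    for n in range(5, n_max + 1):
        for i in range(2, n):
            if 2 * i >= n:
                break
            lower, upper = tails_when_x1_is_outlier(n, i)
            if lower + upper != two_tail_total_from_relationship_1(n, i):
                return False
    return True


if __name__ == "__main__":
    print(relationship_holds(60))
```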

It is to be noticed that P[x(i) > θ] does not differ much from
P[x(n+1-i) < θ] when i is of at least moderate size (this ordinarily
implies that n is at least moderately large). A desirable feature of
the results presented is that the probability can be accurately
determined for each tail of an interval or test.
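
To make the per-tail statement concrete, the following Python sketch
(an illustration added here, with hypothetical function and situation
names) returns the exact probability in each tail for three situations:
a true random sample of size n, x(1) an outlier, and x(n) an outlier.
The x(n) case is obtained by mirroring the x(1) case, in line with the
remark that a similar method applies.

```python
from fractions import Fraction
from math import comb


def tail_probabilities(n: int, i: int, situation: str = "random"):
    """Return (P[x(i) > theta], P[x(n+1-i) < theta]) as exact fractions.

    situation is one of "random", "x1_outlier", "xn_outlier"; these
    labels are assumptions made for this illustration. Requires i >= 2.
    """
    def cumulative(m: int, k: int) -> Fraction:
        # sum_{j=0}^{k} C(m, j) / 2^m
        return Fraction(sum(comb(m, j) for j in range(k + 1)), 2 ** m)

    if situation == "random":
        tail = cumulative(n, i - 1)
        return tail, tail
    if situation == "x1_outlier":
        return cumulative(n - 1, i - 2), cumulative(n - 1, i - 1)
    if situation == "xn_outlier":
        return cumulative(n - 1, i - 1), cumulative(n - 1, i - 2)
    raise ValueError("unknown situation: " + situation)


if __name__ == "__main__":
    for name in ("random", "x1_outlier", "xn_outlier"):
        lower, upper = tail_probabilities(30, 10, name)
        print(name, float(lower), float(upper), float(lower + upper))
```

In both outlier situations the two tails differ by
C(n-1, i-1) / 2^(n-1), while their sum matches the random-sample total.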


